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Abstract 


Background: The coronavirus 3 chymotrypsin-like protease (3CL"'°) is a validated target in the design of potential 
anticoronavirus inhibitors. The high degree of homology within the protease’s active site and substrate conservation 
Supports the identification of broad spectrum lead compounds. A previous study identified the compound ML188, 
also termed 16R, as an inhibitor of the Severe Acute Respiratory Syndrome coronavirus (SARS-CoV) 3CLP™. This study 


will detail the generation of a homology model of the 3CL 


oro 


of the human coronavirus OC43 and determine the 


potential of 16R to form a broad-spectrum lead compound. MODELLER was used to generate a suitable 
three-dimensional model of the OC43 3CL°° and the Prime module of Schrddinger predicted the binding 
conformation and free energy of binding of 16R within the 3CL°® active site. Molecular dynamics further 
confirmed ligand stability and hydrogen bonding networks. 


Results: A high quality homology model of the OC43 3CL"° was successfully generated in an active conformation. 
Further studies reproduced the binding pose of 16R within the active site of the generated model, where its free 
energy of binding was shown to equal that of the 3CL"'° of SARS-CoV, a receptor it is experimentally proven to inhibit. 
The stability of the ligand was subsequently confirmed by molecular dynamics. 


Conclusion: The lead compound 16R may represent a broad-spectrum inhibitor of the 3CL”'° of OC43 and potentially 
other coronaviruses. This study provides an atomistic structure of the 3CLP° of OC43 and supports further experimental 
validation of the inhibitory effects of 16R. These findings further confirm that the 3CL°° of coronaviruses can be 

inhibited by broad spectrum lead compounds. 
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Background 

Human coronaviruses have a worldwide distribution and 
are frequently associated with self-limiting upper re- 
spiratory tract disease or “the common cold”. They can, 
however, also present with high morbidity outcomes of 
the lower respiratory tract including bronchiolitis, pneu- 
monia, [1-3], asthmatic exacerbations [4] and acute ex- 
acerbations of chronic obstructive pulmonary disease 
(COPD) [5], where human coronavirus OC43 has been 
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shown to be the prevalent strain in several regions [6]. 
Given the high burden of coronaviruses to human 
health, and their potential for genetic recombination to 
give rise to the emergence of completely novel strains, 
the presence of antiviral strategies is paramount. How- 
ever, there is currently no commercially available mo- 
lecular entity which is capable of inhibiting infection 
with human coronaviruses [7]. The 3CL?"® has been vali- 
dated as an effective drug target in several studies and 
has even been termed “the Achilles’ heel of corona- 
viruses” [8] making it an ideal target for the identifica- 
tion of novel lead compounds. 
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During coronavirus replication the large open reading 
frame (ORF) 1a/lab genes, located at the 5’ end of the 
genome, are responsible for expressing two large replicase 
polyproteins (pp). These are co- or post-translationally 
cleaved by the virally encoded 3CL?"® and Papain-like pro- 
tease to yield 16 non-structural proteins responsible for 
viral replication [9]. The 3CLP*° is also termed the main 
protease as it cleaves a total of 11 cleavage sites within 
ppla and pplab [10], in comparison to only three cleavage 
sites predicted for the papain-like protease [11]. 

The 3CLP"® has three distinct domains. Domains I and 
II are largely composed of several anti-parallel 6 barrels 
[12] and is connected to domain III by a long loop re- 
gion. Domain III is composed of several globular a- 
helices and plays a role in protein dimerization, along 
with the N-terminal N-finger (residues 1-7). The active 
site is located in a chymotrypsin-like fold between do- 
mains I and II and contains a catalytic dyad of His41, 
which acts as a proton acceptor, and Cys144/5, which 
undergoes nucleophilic attack on the carbonyl carbon of 
the substrate [8,13]. The 3CL?’® is so named for its close 
structural and sequence homology to the 3 chymotryp- 
sin protease (3C?"°) of rhinoviruses, which contains a 
catalytic triad composed of His, Cys and Glu or Asp. 
Superimposition of the structures of 3CLP™® and 3CP*° 
shows that the His and Cys of both proteases is almost 
perfectly aligned, which may explain the similar sub- 
strate specificity and catalytic mechanism [14]. The pos- 
ition of the Glu/Asp in 3C?"® is, however, replaced by a 
water molecule in 3CLP"°, which shares three hydrogen 
bonds with surrounding residues including the catalytic 
dyad member His41. It is assumed that this water mol- 
ecule is responsible for stabilization of the histidine in 
the intermediate state during enzymatic cleavage [8]. 

All coronavirus 3CLP"° share a high sequence hom- 
ology, as well as main chain architecture and substrate 
conservation [15,16]. The substrate binding site of the 
3CLP"° has two deeply buried S; and S, subsites, with 
shallow S,; $3 and S, subsites with varying degrees of 
solvent exposure. Substrate specificity of coronavirus 
3CLP" is mainly determined by the Pj, Py and P}: posi- 
tions [16]. The P, position has an absolute specificity for 
glutamine which stabilizes the S, subsite via a hydrogen 
bond with the imidazole Ne2 of His162/3 and van der 
Waals interactions with surrounding residues of the S, 
pocket. The P, site has a preference for leucine or me- 
thionine to fill the hydrophobic Sz pocket. The side- 
chains of the S3 site are solvent exposed and therefore 
this site is expected to tolerate a wide range of function- 
ality, but shows a preference for basic residues [17]. 
Sidechains and backbones of residues surrounding the 
S4 Site create a highly congested pocket which favors a 
small, hydrophobic residue in the P, position, either Ser, 
Thr, Val or Pro [17-19]. The S; and S, subsites also 
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accommodate small residues in the P; and P» positions, 
which may include Ser, Ala or Gly [18,20]. A typical 
cleavage recognition site is therefore (Ser, Ala)-(Val, 
Thr)-Leu-Glu | (Ser, Ala, Gly), which is conserved 
among all coronavirus 3CL’’® [21]. Theoretically, since 
these viruses share a highly homologous binding site 
and substrate conservation, it is reasonable to antici- 
pate that broad spectrum inhibitor strategies might be 
successful. [13]. 

A noncovalent, furyl amide ligand, 16R, which was 
identified by a high throughput screen, with subsequent 
lead optimization based on structure-activity relation- 
ships within the $,, S; and S» binding pockets was 
shown to inhibit the 3CL?"° of SARS-CoV at an IC50 of 
1.5+0.3 uM [12]. This study will detail the generation of 
a theoretical model of the OC43 3CLP" based on a high 
sequence homology with the solved 3CL?® structure of 
HKU1 and assess the potential of the inhibitor, 16R, to 
inhibit this complex in a broad-spectrum manner. This 
study will therefore provide further evidence to support 
the identification of broad spectrum 3CL?"® inhibitors. 


Results and discussion 

Homology modelling of the OC43 3CLP° 

The 3CLP” structure of HKU1 [PDB: 3D23] displayed a 
high sequence identity of 82.3% to the 3CL® of OC43, 
with an e-value of 0.0. The high degree of identity can 
be partly expected as both OC43 and HKU1 are human 
coronaviruses from the Betacoronavirus genus. The ex- 
ceptionally high degree of identity may even further sug- 
gest a recent common ancestor, which has yet to be 
identified. The active site residues are also highly con- 
served between both sequences indicating that 3D23 
forms a highly suitable template for model generation 
(Figure 1). 

Homology models were built with MODELLER (9v10) 
[22,23] where the lowest discrete optimized protein en- 
ergy (DOPE) score corresponded to model five with a 
GA341 score of 1, indicating that the model quality cor- 
responded with low resolution crystallographic struc- 
tures. The DOPE score profile of target and template 
(Figure 2) were nearly perfectly overlaid, indicating that 
the model was close to its native state. A peak in DOPE 
score for HKU1 3CLP"® (3D23) was observed at approxi- 
mately residue 50, where OC43 3CLP"° showed a moder- 
ate conservation in DOPE score. Colouring the HKU1 
3CLP*® (3D23) structure by B-factor indicates the pres- 
ence of a highly variable loop region from Ser46 to 
Asp53 (Figure 3). The presence of this highly variable 
loop structure could explain the increase in the DOPE 
score profile in this region and may suggest that the 
homology model has assumed a more stable conform- 
ation than the template. Structural alignments where 
the root mean square deviation (RMSD) is below 2 A 
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Figure 1 Pairwise sequence alignment of OC43 3CLP° with the template structure of 3D23. Sequence alignment revealed a high identity of 
82.3%, Asterisks indicate conserved residues between target and template. Conserved active site residues are highlighted in red. Important 
residues within the oxyanion loop (yellow), S,; pocket (blue) and S2 pocket (black) are also highlighted to display high degree of conservation 


within the active site. 


between target and template indicates that the posi- 
tions of all backbone elements are correct [24,25]. 
Superimposition of the 3D23 template and modelled 
OC43 3CL?"® structure displayed an RMSD of 0.327 A 
suggesting a highly accurate prediction of the position 
of all backbone elements (Figure 4). Analysis of the 
overall model quality of target and template by ProSA 
Z-score indicated that both fall within an acceptable 
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Figure 2 DOPE score profiles of template, 3D23, and homology model 
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range for crystallographic structures with a Z-score for 
3D23 of —7.04 and -—7.34 for the homology model of 
OC43 3CL?P"° (Figure 5). Stereochemical analysis of 
phi-psi dihedral angles indicated that 91.8% of residues 
were in the most favoured regions with none in the dis- 
allowed regions (Figure 6). 

For the latter purposes of this model, in characterization 
of the inhibitory potential of 16R, it is essential to confirm 
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of OC43 3CL"'®. General overlay of profiles indicates the generated model 
is close to its native structure. The spike at residue 50 corresponds to a variable loop structure for which OC43 3CLP° has assumed a more 
stable conformation. 
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Figure 3 Location of highly variable loop region in the 3CL?'° of HKU1 (3D23). Colouring of backbone elements by B-factor indicates the 
presence of a highly variable loop from residues Ser46 to Asp53. This variable loop region may be responsible for the spike in DOPE score for 


3D23, for which OC43 3CL?° has assumed a more stable conformation. 


that the receptor in its active conformation. To ensure the 
modelled 3CL?*° of OC43 is in its active conformation, 
several features can be used to differentiate between the 
active and inactive states. These features include the main- 
tenance of a loop conformation of the loop connecting 
domain II and III and the oxyanion loop. If these features 
assume an alpha helical conformation it directly or indir- 
ectly leads to the collapse of the oxyanion hole and 
thereby forms an inactive state [26]. An alpha helix con- 
formation of residues 186-190 in the loop region between 
domain II and III causes a further collapse of the S. and 


S, subsites [19]. The orientation of the His163 sidechain 
in the S; pocket is also vital for substrate binding [27]. 
This residue is largely kept in place by a stacking inter- 
action with the benzene ring of Phe140 and imidazole ring 
of His163 [26]. The orientation of His172 is also stabilized 
by a hydrogen bond with the sidechain of Glu166. This 
prevents steric interactions between His172 and His163 
allowing it to maintain its orientation in the S; pocket. A 
hydrogen bond between His163 and Tyr161 has also been 
implicated in stabilizing the sidechain of His163 [27], 
however we could not observe the formation of this bond 


Figure 4 Superimposition of 3D23 and homology model of OC43 3CLP™. Blue ribbon represents the template structure of 3D23 with green 
representing the homology model of OC43 3CLP"°. Superimposition shows a high degree of structural homology with a low RMSD of 0.327 A. 
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Figure 5 Overall quality of the model as assessed by Z-score. (a) Z-score for crystallographic model of 3D23. (b) Z-score for the homology model 


of OC43 3CLP°. A strong correlation between template and target structures indicates an accurate model. 
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Figure 6 Stereochemical analysis of phi-psi dihedral angle of the OC43 3CL?'°. Ramachandran plot generated by PROCHECK indicates that 91.8% 
of residues are located in the most favoured regions with none in the disallowed regions. Based on analysis of 118 solved structures, a good 
model is expected to have above 90% of residues in regions A, B and | [35]. 
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in the crystal structure of SARS-CoV 3CLP" (3V3M), rais- 
ing the question of its importance to maintain the orienta- 
tion of His162/3. 

Utilising these previously mentioned parameters it is 
possible to ascertain if the generated homology model of 
OC43 3CLP"® is in an active conformation. With the ex- 
ception of the hydrogen bond between Tyrl61 and 
His163, which may form in a dynamic system as the 
Tyr161 hydroxyl is in close proximity to the imidazole 
nitrogen of His163 but with incorrect geometry to form 
a hydrogen bond, all other interactions and conforma- 
tions were maintained indicating that the homology 
model generated using the 3CL?® structure of HKU1 
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(3D23) as a template is representative of the active con- 
formation of the enzyme (Figure 7). With this and previ- 
ously mentioned results, indicating the generation of an 
appropriate homology model of the OC43 3CL, suggests 
the generated structure is suitable for further structure- 
based drug design techniques. 


Assessing the binding conformation and free energy of 
binding of 16R 

Analysis of the crystallographic structure of the SARS- 
CoV 3CL?P", with the bound inhibitor 16R, illustrates a 
pyridine occupying the S, pocket and forming a hydro- 
gen bond with the imidazole Ne2 of His163. A further 
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Figure 7 Features present in the homology model of OC43 3CL"’° which represent of an active state of the enzyme. (a) Orientation of His163 is 
essential for substrate binding. This orientation is maintained by stacking interactions with Phe140 and a hydrogen bond with Tyr161. The 
importance of this bond is questionable as it is not observed in all crystallographic models. (b) Steric interactions between His172 and His163 
disrupt the active conformation of His163. To prevent this, His172 is stabilized by a hydrogen bond with Glu166. (c) Maintenance of loop 
structures of the oxyanion loop (blue) and the loop connecting domain II and Ill (red) are essential in stabilizing the oxyanion hole. The general 


three domain structure of all 3CL°'° is also depicted. 
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two hydrogen bonds are formed between ligand car- 
bonyls and Gly143 and Glu166 with three methyl groups 
inserting into the deep hydrophobic Sz pocket [12]. The 
conformation of 16R is almost completely conserved 
when bound to the 3CLP’® of OC43. Notable exceptions 
include the shifting of the hydrogen bond formed with 
Gly143 to a furan ring oxygen as opposed to the car- 
bonyl oxygen seen in SARS-CoV 3CLP". The hydrogen 
bond formed with Glul66 is also absent in OC43 
3CLP", however the distance between the ligand car- 
bonyl and backbone hydrogen is 2.75 A and therefore 
this may be capable of mediating hydrogen bond forma- 
tion in a dynamic system (Figure 8). Free energy of bind- 
ing between 16R and SARS-CoV and OC43 3CL?P", as 
assessed by MM-GBSA, is -85 kcal/mol for both recep- 
tors. These results suggest that the inhibitor 16R may be 
capable of inhibiting both complexes. 


Molecular dynamic simulation of OC43 and 16R 

To further confirm the findings described in the previ- 
ous section, and to assess the potential of Glul66 to 
form a hydrogen bond in a dynamic system, molecular 
dynamic simulations were used. Molecular dynamics of 
the SARS-CoV 3CLP"°-16R complex are in strong agree- 
ment with the crystal data of 3V3M [12]. The simulation 
predicts an average of 3.17 hydrogen bonds formed over 
the 10 ns trajectory, with these bonds predominantly 
forming with Glul66, His163 and Gly143. The bond 
formed at Gly143 does however display the potential to 
form a bifurcated interaction with Asn142. As predicted 
by the previous results, detailed in the section describing 
the binding conformation, the 3CLP"°-16R complex of 
OC43 displayed a high occupancy of 100% and 85% at 
both His163 and Gly143 positions respectively. These re- 
sults however suggested that the bond formed with 
Gly143 was mediated via the furan ring oxygen, where 
molecular dynamic simulations indicated that this bond 
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was formed between the carbonyl oxygen and Gly143, as 
seen with the crystal structure of SARS-CoV. Molecular 
dynamics also confirmed that the hydrogen bond at the 
Glu166 does not stably form in a dynamic environment 
and only displays the potential to form at 0-2 ns and 4- 
5 ns, with an overall occupancy of 21.62%. 

The stability of the bound ligand within the active 
pocket was also assessed via RMSD, radius of gyration 
and changes in solvent accessible surface area. A RMS 
deviation below 2 A is an indication of a ligand which is 
stably bound to its receptor [28], where an increase in 
radius of gyration from the starting reference would sug- 
gest ligand instability. An increase in solvent accessible 
surface area of the ligand may indicate the ligand is dis- 
sociating into surrounding solvent [29]. Ligand RMSD 
within the active site was approximately 0.75 A and ra- 
dius of gyration and solvent accessible surface area 
remained within a stable range of 3.9 A to 4.0 A and 
4.8 nm’ to 5.4 nm” respectively (Figure 9). These param- 
eters suggest that the ligand is highly stable within the 
active site. 


Conclusion 

The three-dimensional structure of the OC43 3CLP"® 
was successfully solved by homology modelling with 
MODELLER. Analysis of various side chains and loop 
conformations within and surrounding the substrate- 
binding site is indicative of an active conformation of 
the enzyme. The solved structure has been made publicly 
available in the Protein Model Database (PMDB; https:// 
bioinformatics.cineca.it/PMDB/) [PMDB ID: PM0079872]. 
Further analysis of a previously identified lead compound, 
16R, suggests that it makes extensive and stable interac- 
tions with the 3CL?"® binding site of OC43. Additional 
analysis of the free energy of binding by MM-GBSA 
suggests that the ligand binds with the same affinity to 
the 3CLP"® of OC43 as it does to SARS-CoV, a receptor 
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Figure 8 Binding conformation of 16R within the active pocket of the 3CL®’®. (a) Pose of 16R within the active site of the SARS-CoV 3CLP°. 
Hydrogen bonds (yellow dashes) are formed with Gly143, His163 and Glu166. (b) 16R assumes a similar pose when bound to the OC43 3CLpro. A 
notable difference is the loss a hydrogen bond with Glu166. Hydrogen bonds with Gly143 and His163 are however maintained. 
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Figure 9 RMSD and solvent accessible surface area of 16R when bound to the OC43 3CLP. Analysis of ligand RMSD and solvent accessible 


surface area depicts a ligand which is stably bound to its receptor. 


it is experimentally proven to inhibit. These results are 
strongly suggestive that the lead compound, 16R, not 
only displays the potential to inhibit the 3CL?’° of 
SARS-CoV but also that of the highly homologous OC43. 
In additional studies (data not shown) 16R was shown to 
remain stably bound to the 3CLP"® of 229E and NL63 in 
molecular dynamic simulations and may therefore even 
display the potential to inhibit the 3CLP"° of additional 
coronaviruses. 16R is therefore an excellent lead com- 
pound in the pursuit of true broad spectrum inhibitors of 
all coronaviruses. 


Methods 

Homology modelling of the OC43 3CLP° 

The amino acid sequence of the OC43 3CLP"° was ob- 
tained from the NCBI database [GenBank: AEN19363]. 
Fold assignment with the profile.build module of MOD- 
ELLER (9v10) identified the most suitable template 
structure from PDB with an 82.3% sequence identity and 
e-value of 0. Template-target alignment was conducted 
with the align.2d module, which also incorporates struc- 
tural information from the template. Models of the 
OC43 3CLP"® were built by satisfaction of homology de- 
rived spatial restraints on distances and dihedral angles 
[23], stereochemical restraints obtained from the 
CHARMM22 force field [30] and statistical preferences 
on dihedral angles and non-bonded distances obtained 
from known protein structures [22,31]. Models were 
built by the automodel class by minimizing the viola- 
tions on all restraints and united-atom PDB models were 
generated. 

DOPE scores are assigned by considering the positions 
of all non-hydrogen atoms, where the lowest DOPE 
score corresponds to the model that is predicted to be 
most accurate [31]. The evaluate_model.py script was 
further utilised to calculate per residue DOPE scores, 
which were superimposed over the template structure 
DOPE scores. A general conservation in the pattern of 
respective DOPE scores was used as an indication of an 


accurate model. A low RMSD, as assessed by PYMOL 
[32], indicated structural homology between solved 
structure and template. The GA341 score was used to 
analyse fold-assignment. Z-scores, obtained from ProSA- 
web [33,34] were used to analyse overall model quality 
and to assure template and query models were in a 
range acceptable for structures of their respective size. A 
Z-score outside a range characteristic for native proteins 
will suggest the production of an erroneous model [33]. 
Stereochemical analysis of dihedral angles was then fur- 
ther conducted using PROCHECK [35,36] on models 
corresponding to the lowest DOPE score with GA341 
scores amounting to 1.0. Stereochemical analysis by 
PROCHECK assessed residue geometry, where more 
than 90% of residues should be located in the “most 
favoured region” to indicate a stereochemically favourable 
structure. 

Once an appropriate model accuracy was achieved 
based on these previous observations, the active site of 
the enzyme was analysed to ensure it was in its active 
state. Several features are able to differentiate an active 
from inactive state of the 3CLP"°. These include the con- 
formation of the oxyanion loop and loop connecting do- 
main II and III [19,26] and the sidechain orientation of 
residues, including His163, Phe140, His172, Tyr161 and 
Glu166, within the S$, pocket [26,27]. 


Assessing the binding conformation and free energy of 
binding of 16R 

Protein structures of 3V3M (3CLP’° of SARS-CoV) and 
the homology modelling derived structure of OC43 
3CLP"° were prepared with the Schrédinger Protein 
Preparation Wizard. Histidine protonation was assigned 
to the epsilon nitrogen in accordance with crystal data. 
A restrained minimization was then performed using the 
OPLS2005 force field [37,38], during which heavy atoms 
were restrained to remain within 0.3 A of the original 
structure. To assess the conformation of the known in- 
hibitor, 16R, when binding to the OC43 3CLP", the 
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experimental structure of 3V3M and OC43 3CLP"® were 
aligned in Maestro, the interface for all Schrédinger’s 
software. The inhibitor, 16R, was subsequently superim- 
posed over the OC43 3CLP"° active site using coordi- 
nates present in the 3V3M crystal structure. Side chains 
of residues in the active pocket assuming orientations 
that resulted in a steric clash with the ligand were re- 
fined and the ligand was minimized with Prime [39,40] 
to better fit the pocket. Prime utilizes both empirically 
and theoretically derived constraints to predict and rep- 
licate the dynamic motion of protein sidechains, allow- 
ing for conformational changes that arise through the 
“induced-fit” phenomenon, which is often neglected in 
molecular docking screens. 

The Prime/MM-GBSA method was used to calculate 
the binding free energy of 16R within the respective re- 
ceptor. The free energy of binding is the calculated dif- 
ference between the minimized _receptor-inhibitor 
complex and the sum of the energies of the minimized 
unbound inhibitor and receptor. Ligand poses were min- 
imized using the local optimization feature in Prime, 
where energies of the complex were calculated with the 
OPLS-2005 force field and GBSA continuum solvent 
model. 


Molecular dynamic simulation of the 3CL?'°-16R complex 

The CHARMM27 all atom force field was assigned to 
receptor structures using the three point TIP3P water 
model. All input hydrogens in the coordinate file were 
ignored and were assigned according to the force field. 
Histidine protonation states were assigned to the epsilon 
nitrogen in the neutral form in accordance with crystal- 
lographic data. The ligand, in the previously predicted 
pose, was prepared by SwisParam [41]. The protein- 
ligand complex was then placed in the centre of a sol- 
vated, cubic box, 1 nm from the box edge. The system 
was solvated with the spc216 water model and sodium 
counter ions were added to neutralize the system. The 
system was energy minimized by a brief dynamics simu- 
lation using Steepest Descent Minimization until stee- 
pest descent converged to Fmax<1000. Long range 
electrostatic interactions were calculated with the Par- 
ticle Mesh Ewald (PME) method [42] with a cutoff of 
1 nm and periodic boundary conditions were applied. 
Both NVT and NPT equilibrations were run for 50 000 
steps or 100 ps using a 2 fs time step and a leap-frog in- 
tegrator. All bonds were constrained by the lincs algo- 
rithm. PME was again used for long range electrostatics 
with a 0.16 fourier spacing. Short range electrostatic and 
van der Waals cutoffs were 1.0 nm and 1.4 nm respect- 
ively. Coordinates, velocities and energies were saved 
every 0.2 ps. During NVT equilibration temperature 
coupling was achieved by V-rescale algorithm with a tar- 
get temperature of 300 K. Pressure coupling during NPT 
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equilibration was achieved via the Parrinello-Rahman al- 
gorithm and is incorporated in the NPT equilibration 
once temperature is stable to ensure the proper density 
is reached (approximately 1000 kg/m”). Following NVT 
and NPT system equilibration an unrestrained, 10 ns 
simulation (5 000 000 steps) was run with the leap frog 
integrator, saving coordinates, velocities and energies 
every 2 ps. All other parameters were maintained from the 
NVT and NPT equilibration including both temperature 
and pressure coupling. 
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